Instant OS Updates via Userspace Checkpoint-and-Restart
Abstract
In recent years, operating systems have become increasingly complex and thus more prone to security and performance issues. Accordingly, system updates that address these issues have become more frequent and more important. To complete such updates, users must reboot their systems, which causes unavoidable downtime and loses the state of running applications. We present KUP, a practical OS update mechanism built on userspace checkpoint-and-restart. It uses an optimized data structure for checkpointing on disk and a memory persistence mechanism across the update, coupled with a fast in-place kernel switch. This allows instant kernel updates that span major kernel versions without any kernel modifications. Our evaluation shows that, unlike well-known dynamic hot-patching techniques (e.g., ksplice), KUP can support any type of real kernel patch (e.g., security, minor, or even major releases) while running large-scale applications such as memcached and mysql, or even in the middle of a Linux kernel compilation. KUP can also update a running Linux kernel from v3.17-rc7 to v4.1 with 3 seconds of overall downtime and without losing 32 GB of memcached data.
Similar Resources
Instant Recovery with Write-Ahead Logging: Page Repair, System Restart, and Media Restore
Traditional theory and practice of write-ahead logging and of database recovery techniques revolve around three failure classes: transaction failures resolved by rollback; system failures (typically software faults) resolved by restart with log analysis, “redo,” and “undo” phases; and media failures (typically hardware faults) resolved by restore operations that combine multiple types of backup...
Checkpoint/Restart of Virtual Machines Based on Xen
System level virtualization provides several advantages: (i) customization is eased since virtual machines may be based on different systems; (ii) virtual machines are isolated from hardware, subsequently applications are isolated via the virtual machines; (iii) basic fault tolerance mechanisms – pro-active fault tolerance through virtual machine migration and virtual machine snapshot/restore; ...
Green Threads User level threading for a Library OS
Library operating systems have long been researched to study the impact of pushing many functionalities out of the kernel and into userspace, for the purposes of security, compatibility and performance. The Drawbridge[1] project demonstrated that a commercial, large operating system (OS) like Windows can be refactored into a Library OS supporting standalone apps, providing access to low-level r...
Linux-CR: Transparent Application Checkpoint-Restart in Linux
Application checkpoint-restart is the ability to save the state of a running application so that it can later resume its execution from the time of the checkpoint. Application checkpoint-restart provides many useful benefits including fault recovery, advanced resources sharing, dynamic load balancing and improved service availability. For several years the Linux kernel has been gaining the nece...
A Checkpoint and Restart Service Specification for Open MPI
HPC systems are growing in both complexity and size, increasing the opportunity for system failures. Checkpoint and restart techniques are one of many fault tolerance techniques developed for such adverse runtime conditions. Because of the variety of available approaches for checkpoint and restart, HPC system libraries, such as MPI, seeking to incorporate these techniques would benefit greatly ...
Publication date: 2016